MongoDB to SQL Migration Plan

Moving our existing customer databases to a SQL-based DB is already a settled decision, but which SQL database should we actually pick? With PostgreSQL, MySQL, MariaDB, Oracle, Microsoft SQL Server, and SQLite all on the table, we need to establish criteria for the decision.

Decision Criteria

1. Price

Enterprise DBs are obscenely expensive. The target must be cheaper than MongoDB and still reasonably priced once AWS costs are factored in.

https://aws.amazon.com/rds/pricing/

| DB Type | Deployment | Instance | Price Per Hour |
| --- | --- | --- | --- |
| RDS for Oracle License Included (LI) | Single-AZ | db.t3.small | $0.088 |
| RDS for Oracle License Included (LI) | Multi-AZ | db.t3.small | $0.176 |
| RDS for Oracle BYOL [1] | Single-AZ | db.t3.small | $0.052 |
| RDS for Oracle BYOL [1:1] | Multi-AZ | db.t3.small | $0.104 |
| RDS for MySQL | Single-AZ | db.t4g.micro | $0.025 |
| RDS for MySQL | Single-AZ | db.t4g.small | $0.051 |
| RDS for MySQL | Multi-AZ | db.t4g.micro | $0.051 |
| RDS for MySQL | Multi-AZ | db.t4g.small | $0.102 |
| Amazon RDS for PostgreSQL | Single-AZ | db.t4g.micro | $0.025 |
| Amazon RDS for PostgreSQL | Single-AZ | db.t4g.small | $0.051 |
| Amazon RDS for PostgreSQL | Multi-AZ | db.t4g.micro | $0.051 |
| Amazon RDS for PostgreSQL | Multi-AZ | db.t4g.small | $0.102 |
| RDS for MariaDB | Single-AZ | db.t4g.micro | $0.025 |
| RDS for MariaDB | Single-AZ | db.t4g.small | $0.051 |
| RDS for MariaDB | Multi-AZ | db.t4g.micro | $0.051 |
| RDS for MariaDB | Multi-AZ | db.t4g.small | $0.102 |
| RDS for SQL Server | On-Demand Instances | db.t3.micro | $0.031 |
| RDS for SQL Server | On-Demand Instances | db.t3.small | $0.062 |

AWS Aurora is a high-performance database service compatible with MySQL and PostgreSQL. It comes in two billing flavors: Aurora Standard, which charges for I/O, and Aurora I/O-Optimized, which charges by storage capacity instead.

Aurora also offers several capacity options: Serverless, Provisioned On-Demand Instances, Provisioned Reserved Instances, and Limitless.

The Aurora Serverless option scales automatically, as serverless implies: it shuts itself down when there are no requests and allocates additional capacity when a burst of transactions arrives. Unlike the Provisioned options it is not always on, so usage is billed in a dedicated unit called the ACU (Aurora Capacity Unit), metered with per-second precision.

| Measure | Aurora Standard (per ACU hour) | Aurora I/O-Optimized (per ACU hour) |
| --- | --- | --- |
| Aurora Capacity Unit | $0.20 | $0.26 |
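
To get a feel for the numbers, a rough back-of-the-envelope sketch (the 730-hour month and the steady average load are simplifying assumptions):

// Rough Aurora Serverless cost sketch. Assumptions: steady average load,
// 730-hour month, Aurora Standard ACU pricing from the table above.
const ACU_HOUR_STANDARD = 0.2; // $ per ACU-hour
const HOURS_PER_MONTH = 730;

function monthlyServerlessCost(avgAcu: number): number {
  return avgAcu * HOURS_PER_MONTH * ACU_HOUR_STANDARD;
}

console.log(monthlyServerlessCost(1));   // 146 -> ~$146/month at a steady 1 ACU
console.log(monthlyServerlessCost(0.5)); // 73  -> ~$73/month at a steady 0.5 ACU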

If serverless doesn't appeal, there are Provisioned On-Demand Instances and Provisioned Reserved Instances. The former is an ordinary RDS instance billed as you go and invoiced monthly, while a Reserved Instance is paid for upfront in multi-month or multi-year terms.

| Standard Instances - Current Generation | Aurora Standard (Price Per Hour) | Aurora I/O-Optimized (Price Per Hour) |
| --- | --- | --- |
| db.t4g.medium | $0.113 | $0.147 |
| db.t4g.large | $0.227 | $0.295 |
| db.t3.medium | $0.125 | $0.163 |
| db.t3.large | $0.25 | $0.325 |

2. Support

Pay enough money (i.e., buy an enterprise plan) and you get all kinds of technical and after-sales support. PostgreSQL reportedly also has commercial support available for enterprise customers. That kind of backing is certainly sturdier than raw open source, so it is worth considering in a business where money changes hands.

3. Performance

We're coming from NoSQL anyway... for our relational access patterns, practically any SQL DB should deliver a noticeable performance improvement.

4. ORDB

PostgreSQL gives me pause here. It is not a plain RDBMS but an ORDBMS, so the notion of an "object" comes into play: functions can be attached to a schema, making it an object through and through. I am not yet sure what differentiation that buys us, and we still need to figure out whether our domain needs an ORDB at all.

5. Types

Do we use types like Enum, Uuid, Decimal, Json, and Array in our service? If so, a database that supports as many of these types natively as possible will have the edge; otherwise we will have to add conversion logic during the migration.
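
For a feel of what that conversion logic might look like, a minimal sketch (the Decimal128/ObjectId handling uses the Mongo driver's own types; the target mappings in the comments are assumptions, not a settled schema):

import { Decimal128, ObjectId } from "mongodb";

// Minimal sketch: normalize Mongo-specific values into SQL-friendly
// primitives before insertion.
function toSqlValue(value: unknown): unknown {
  if (value instanceof ObjectId) return value.toHexString(); // -> VARCHAR(24), or a UUID later
  if (value instanceof Decimal128) return value.toString();  // -> NUMERIC, via string to avoid float loss
  if (Array.isArray(value)) return value.map(toSqlValue);    // -> PostgreSQL array or JSONB
  return value; // Date, string, number, boolean pass through
}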

2025-03-22 edited

My preference is tilting toward PostgreSQL. The biggest factor is the Json type: value objects can live in a column of their own rather than in a separate table, and there is syntax for querying JSON properties --though that will probably require raw queries-- which should make the migration considerably easier. We can't just migrate blindly, though: we still have to work out how to convert MongoDB's ObjectID into a PK field, and rebuild the relationship graph between tables that currently have no enforced relations.
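
For reference, a raw JSONB query through Prisma could look like this sketch (the academies table and the address->>'city' property are hypothetical, borrowed from the schema discussion below):

import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

// Sketch: filter on a JSONB property with PostgreSQL's ->> operator via
// $queryRaw. $queryRaw parameterizes ${city}, so no injection risk.
async function findAcademiesByCity(city: string) {
  return prisma.$queryRaw`
    SELECT id, name
    FROM academies
    WHERE address->>'city' = ${city}
  `;
}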

AI Response :: Claude 3.7 Sonnet

Based on your existing backend system built with NestJS, MongoDB, and Prisma ORM, I'll outline a comprehensive strategy for migrating to a SQL-based database. From your requirements and the notes in the paste.txt document, PostgreSQL appears to be the preferred option, so I'll focus primarily on that while highlighting considerations for other SQL databases.

1. PostgreSQL as the Primary Target

PostgreSQL offers several advantages that align with your needs:

  • JSONB for storing value objects and embedded documents without extra tables
  • Rich native types (UUID, enums, arrays, NUMERIC) matching your Prisma models
  • Competitive pricing on RDS and Aurora (see the tables above)
  • A mature open-source ecosystem with commercial support options

2. Migration Strategy Overview

Phase 1: Schema Design & Preparation

  1. Map Document Schema to Relational Schema

    • Convert MongoDB collections to SQL tables
    • Define explicit relations between entities
    • Design primary and foreign key relationships
  2. Handle ObjectId Migration

    • Option 1: Convert ObjectIds to UUIDs (preferred for PostgreSQL; see the conversion sketch after this list)
    -- PostgreSQL schema example
    CREATE TABLE users (
      id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
      username VARCHAR(255) NOT NULL,
      /* other fields */
    );
    
    • Option 2: Preserve original ObjectId strings in a character field
    -- For compatibility during transition
    CREATE TABLE users (
      id VARCHAR(24) PRIMARY KEY,
      mongo_id VARCHAR(24) UNIQUE, -- Store original during migration
      /* other fields */
    );
    
  3. JSON Field Planning

    • Map embedded MongoDB documents to PostgreSQL JSONB fields
    • Example for address object in user model:
    CREATE TABLE users (
      id UUID PRIMARY KEY,
      /* other fields */
      address JSONB -- Store complex address object
    );
    
  4. Schema Validation

    • Create a test schema and validate with sample data
    • Confirm all domain requirements are fulfilled by the relational design
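
Picking up Option 1 from step 2 above, ObjectId-to-UUID conversion can be made deterministic with UUIDv5, so the same ObjectId always maps to the same UUID across migration runs (the namespace constant below is an arbitrary fixed value chosen for illustration, not a library requirement):

import { v5 as uuidv5 } from "uuid";

// Any fixed UUID works as the namespace, as long as every migration
// run uses the same one; this value is an arbitrary example.
const MIGRATION_NAMESPACE = "a3c9f1d4-2b6e-4f0a-8c7d-5e9b1a2c3d4f";

// Deterministic: the same ObjectId hex string always maps to the same
// UUID, so re-runs and cross-collection references stay consistent.
function objectIdToUuid(objectIdHex: string): string {
  return uuidv5(objectIdHex, MIGRATION_NAMESPACE);
}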

Phase 2: Implementation & Migration

  1. Update Prisma Schema

    • Modify your Prisma schema to work with PostgreSQL
    • Add explicit relations and constraints
    • Example:
    model User {
      id        String   @id @default(uuid())
      name      String
      email     String   @unique
      tickets   Ticket[]
      createdAt DateTime @default(now())
      updatedAt DateTime @updatedAt
      address   Json?    // For complex MongoDB documents
    }
    
  2. Database Migration Tooling

    • Use a combination of custom scripts and tools like MongoDB Compass for export
    • Consider leveraging AWS Database Migration Service if using AWS infrastructure
    • For complex migrations, implement a custom ETL pipeline (see the first sketch after this list)
  3. Data Transformation Scripts

    • Create scripts to handle data transformation between document and relational models
    • Transform ObjectIds to UUIDs/strings
    • Resolve embedded documents to related tables
    • Handle arrays and complex MongoDB-specific structures
  4. Incremental Migration Approach

    • Migrate individual collections/modules separately
    • Start with less complex, less critical modules
    • Implement a dual-write strategy during transition (a sketch follows after this list)
  5. API & Service Layer Updates

    • Modify service classes to handle SQL-specific operations
    • Update repository patterns/DAOs to work with relational data
    • Keep service interfaces consistent for client compatibility
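
As a sketch of the custom ETL pipeline from steps 2-3 (database, collection, and model names are illustrative; objectIdToUuid is the helper sketched in Phase 1):

import { MongoClient } from "mongodb";
import { PrismaClient } from "@prisma/client";

const mongo = new MongoClient(process.env.MONGO_URL!);
const prisma = new PrismaClient();

// One collection's ETL step: read documents from MongoDB, normalize
// Mongo-specific values, and insert rows into PostgreSQL.
// Batching, retries, and progress tracking are omitted for brevity.
async function migrateUsers(): Promise<void> {
  await mongo.connect();
  const cursor = mongo.db("app").collection("users").find();

  for await (const doc of cursor) {
    await prisma.user.create({
      data: {
        id: objectIdToUuid(doc._id.toHexString()), // helper from the Phase 1 sketch
        name: doc.name,
        address: doc.address ?? undefined, // embedded document -> JSONB column
      },
    });
  }

  await mongo.close();
}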
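And a minimal dual-write sketch for step 4's transition period (mongoUserRepo is a hypothetical existing repository; MongoDB remains the source of truth here, with PostgreSQL mirror failures reconciled later):

// Transition-period write path: MongoDB first (source of truth),
// then mirror into PostgreSQL; a mirror failure is logged, not fatal.
async function createUserDualWrite(input: { name: string; email: string }) {
  const mongoUser = await mongoUserRepo.create(input); // existing MongoDB path (hypothetical repo)

  try {
    await prisma.user.create({
      data: {
        id: objectIdToUuid(mongoUser._id.toHexString()),
        name: input.name,
        email: input.email,
      },
    });
  } catch (err) {
    // Reconciled later by a backfill job comparing both stores.
    console.error("dual-write to PostgreSQL failed", err);
  }

  return mongoUser;
}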

Phase 3: Verification & Optimization

  1. Testing Strategy

    • Create comprehensive test suites for migrated functionality
    • Implement parallel testing against both databases
    • Performance testing to validate SQL performance gains
  2. Indexing Optimization

    • Identify high-volume queries from current system
    • Implement appropriate indexes for these queries
    • Example:
    CREATE INDEX idx_user_username ON users(username);
    CREATE INDEX idx_academy_owner ON academies(owner_id);
    CREATE INDEX idx_ticket_plan_type ON ticket_plans(type);
    
  3. Query Optimization

    • Analyze and optimize complex queries
    • Consider denormalization for performance-critical paths
    • Implement caching strategies where appropriate

3. Specific Migration Challenges & Solutions

Complex Document Structures

Challenge: MongoDB's flexible schema allows deeply nested documents and arrays.

Solution:

  • Map nested objects to JSONB columns
  • Map simple string arrays to native PostgreSQL arrays (e.g., TEXT[])
  • Promote embedded documents that need relational querying into their own tables

Example conversion:

// MongoDB document
{
  _id: ObjectId("..."),
  name: "Tennis Academy",
  address: {
    detail: "123 Main St",
    location: {
      coordinates: [127.1, 37.5],
      type: "Point"
    },
    oldAddress: "Old address",
    roadAddress: "Road address"
  },
  images: ["url1", "url2"]
}

PostgreSQL approach:

CREATE TABLE academies (
  id UUID PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  address JSONB,  -- Store complex address object as JSONB
  images TEXT[]   -- Use PostgreSQL array for simple string arrays
);

ObjectId References

Challenge: MongoDB uses ObjectId references between collections.

Solution:

CREATE TABLE academies (
  id UUID PRIMARY KEY,
  -- other fields
);

CREATE TABLE courts (
  id UUID PRIMARY KEY,
  academy_id UUID NOT NULL REFERENCES academies(id),
  -- other fields
);

Handling Transactions

Challenge: Moving from MongoDB's document-level atomicity to SQL transactions.

Solution:

async createAcademyWithCourts(data: CreateAcademyDto): Promise<Academy> {
  return this.prisma.$transaction(async (tx) => {
    const academy = await tx.academy.create({
      data: {
        name: data.name,
        // other fields
      },
    });
    
    for (const court of data.courts) {
      await tx.court.create({
        data: {
          name: court.name,
          academyId: academy.id,
          // other fields
        },
      });
    }
    
    return academy;
  });
}

4. Technical Implementation Details

Schema Conversion Examples

Here are specific schema conversion examples based on your Prisma models:

MongoDB Academy Model:

model Academy {
  id                 String               @id @default(auto()) @map("_id") @db.ObjectId
  name               String
  owner              String               @db.ObjectId
  // other fields
  address            AcademymodelsAddress
  // relations
  Tag               AcademyTag[]
  TicketPlan        TicketPlan[]
}

PostgreSQL Conversion:

model Academy {
  id                 String               @id @default(uuid())
  name               String
  owner              String               @db.Uuid
  // other fields
  address            Json                 // Store as JSONB in PostgreSQL
  // relations with explicit relations
  tags               AcademyTag[]
  ticketPlans        TicketPlan[]
  
  @@index([name])
  @@index([owner])
}

NestJS Service Layer Updates

Your NestJS services will need modifications to handle SQL-specific operations:

// Before (MongoDB)
async findUserTickets(userId: string): Promise<Ticket[]> {
  return this.prisma.ticket.findMany({
    where: {
      user: userId, // ObjectId reference
    },
  });
}

// After (PostgreSQL)
async findUserTickets(userId: string): Promise<Ticket[]> {
  return this.prisma.ticket.findMany({
    where: {
      userId: userId, // Explicit foreign key
    },
    include: {
      ticketPlan: true, // Explicitly include relations
    },
  });
}

Indexing Strategy

Based on your schema and likely query patterns:

  1. Primary Indexes:

    • All primary keys and foreign keys
  2. Performance Indexes:

    • User lookups: username, phone, email
    • Academy lookups: name, owner, location (PostgreSQL GiST index for geospatial)
    • Ticket queries: user, status, expireTime
    • Payment queries: orderId, status, createdTime
  3. Composite Indexes:

    • (academy, court, from, to) for reservation availability checks
    • (user, ticketPlan, isExpired) for active ticket queries
-- Example indexes for PostgreSQL
CREATE INDEX idx_ticket_user_status ON tickets(user_id, status);
CREATE INDEX idx_reservation_time ON reservations(court_id, start_time, end_time);
CREATE INDEX idx_academy_location ON academies
  USING GIST (ST_SetSRID(ST_MakePoint((address->>'longitude')::float, (address->>'latitude')::float), 4326)); -- requires the PostGIS extension

5. PostgreSQL-Specific Advantages

  1. JSONB Operations:

    • Query inside JSON structures:
    SELECT * FROM academies WHERE (address->'location'->'coordinates'->>0)::float > 127.0;
    
    • Index JSON properties:
    CREATE INDEX idx_academy_city ON academies ((address->>'city'));
    
  2. Array Operations:

    • Query array elements:
    SELECT * FROM courts WHERE 'basketball' = ANY(sports);
    
  3. Custom Types and Functions:

    • Create domain-specific types:
    CREATE TYPE reservation_status AS ENUM ('pending', 'confirmed', 'cancelled');
    
    • Implement business logic in database functions:
    CREATE FUNCTION check_reservation_availability(court_id UUID, start_time TIMESTAMP, end_time TIMESTAMP) 
    RETURNS BOOLEAN AS $$
    -- Implementation
    $$ LANGUAGE plpgsql;
    

6. Migration Timeline and Milestones

  1. Preparation Phase (3-4 weeks)

    • Schema design and mapping
    • Test environment setup
    • Migration script development
  2. Testing Phase (2-3 weeks)

    • Data migration verification
    • Performance testing
    • API compatibility testing
  3. Incremental Migration (6-10 weeks)

    • Migrate modules in order: Users → Academies → Courts → Ticket Plans → Tickets → Reservations → Payments
    • Implement dual-write for critical paths
    • Incremental cutover by module
  4. Optimization Phase (2-4 weeks)

    • Performance tuning
    • Index optimization
    • Query refinement
  5. Monitoring and Verification (2 weeks)

    • Extended monitoring period
    • Performance baseline establishment
    • Final optimization adjustments

7. Cost Considerations

Based on your workload:

  1. PostgreSQL on RDS:

    • Starting with a db.t4g.small ($0.051/hour) for development
    • Scaling to db.t4g.medium/large for production
    • Multi-AZ for production environment ($0.102/hour)
  2. PostgreSQL Aurora:

    • Consider Serverless v2 for variable workloads
    • Set minimum and maximum ACU based on typical load patterns
    • Estimated cost: $0.20-0.26 per ACU hour
  3. Migration Costs:

    • Development time for migration scripts
    • Potential dual-running costs during transition
    • AWS DMS costs if using that service

Conclusion

PostgreSQL offers the best balance of features, performance, and cost-effectiveness for your migration from MongoDB. The JSONB type provides a bridge between document and relational models, making the migration path smoother while giving you all the benefits of a robust relational database.

The migration strategy outlined above prioritizes:

  • Incremental, module-by-module migration rather than a big-bang cutover
  • Data integrity, verified by testing at every phase
  • Minimal downtime, with dual-writes during the transition

By following this approach, you can successfully transition from MongoDB to PostgreSQL while improving your system's performance, maintainability, and cost-effectiveness.

AI Response :: o3-mini-high

Below is a comprehensive plan for migrating from MongoDB to a SQL-based system using Prisma ORM. This plan assumes you already have a well-structured Prisma schema for MongoDB and outlines the steps from schema conversion to data migration and verification.


1. Assess and Redesign the Data Model


2. Update the Prisma Schema


3. Set Up the SQL Database Environment


4. Plan and Execute Data Migration


5. Update Application Code and Deployment


6. Monitor and Optimize


Summary

This migration plan involves careful redesign of your data model, updating the Prisma schema, setting up a new SQL environment, and writing an ETL process for data migration. Testing at every stage is crucial to ensure data integrity and application stability. Once the migration is complete, updating your application configuration and monitoring performance will ensure a smooth transition from MongoDB to your new SQL-based system.

Feel free to reach out with any follow-up questions or for further details on specific parts of the migration process.



  1. BYOL - Bring Your Own Licence; applies when you already own the relevant license. ↩︎ ↩︎